String Methods

In this lecture we are going to look at string methods in more detail. To simplify, a method is a function that is bound to a particular type of object. String methods, for example, are functions that only work on strings, list methods are functions that only work on lists, and so on. If you read the OOP lecture this should make sense to you already.

To call these methods, I have to introduce you to a new syntax:

    {Object}.{method name}({arguments, if any})

The syntax above is called ‘dot notation’, and it is one of the main ways we can call an object's method.

Let’s look at an example:


In [1]:
print("hello".upper()) # This works
print(str.upper())     # This returns an error, upper needs an argument


HELLO
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-1-5f8995cc3436> in <module>()
      1 print("hello".upper())
----> 2 print(str.upper())    # returns an error, upper needs an argument

TypeError: descriptor 'upper' of 'str' object needs an argument

So the method 'upper' converts every letter in a string to upper case; "hi" becomes "HI".

Now, remember in the 'Calling Functions' lecture I talked about functions that take zero arguments? Well, at first glance it might look like upper() takes zero arguments, but this isn't true. A method takes the object it is called on as an argument.

So:

"hello".upper()   ---> "HELLO"

Would look like this (if expressed as a function):

upper("hello")   ---> "HELLO"

The syntax is a bit different, but semantically these two things function the same.


In [1]:
print("HELLO".lower())      # This works!
print(str.lower("HELLO"))   # This also works!


hello
hello

Talking of syntax, operators such as addition and multiplication are also methods behind the scenes, and they can be called using two different bits of syntax:

 "Hello" + "World"
 "Hello".__add__("World")

Just as before, these two ways of doing things produce the same result. Anyway, in the rest of this lecture we will be going over some of the various string methods in more detail...
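To check the claim about operators for yourself, here is a small sketch. The variable names are just ones I made up for illustration; `__add__` and `__mul__` are the special methods Python uses for `+` and `*` on strings.

```python
# Each operator version is shorthand for the special-method version below it.
joined_with_operator = "Hello" + "World"
joined_with_method = "Hello".__add__("World")

repeated_with_operator = "na" * 3
repeated_with_method = "na".__mul__(3)

print(joined_with_operator == joined_with_method)      # True
print(repeated_with_operator == repeated_with_method)  # True
```

In everyday code you would always write the operator form; the method form is only shown here to make the point that operators are methods too.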

Isdigit Method


In [1]:
print("Hello".isdigit())
print("99".isdigit())
print("103.2".isdigit())


False
True
False

From the above three lines you can probably get a good idea of what the "isdigit" method is doing: it checks whether each character in the string is a digit (i.e. 0, 1, 2, 3, 4, 5, 6, 7, 8, 9) or not. If every single character is a digit the method returns True, and it returns False otherwise. That is why "103.2" fails the test; the "." is not a digit. An empty string also returns False.

Just because string methods take strings as input DOES NOT mean they must output strings as well.
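A quick sketch of that last point, using only methods we have already met. The variable names `shouted` and `all_digits` are my own, chosen for illustration.

```python
shouted = "hello".upper()      # a string comes back
all_digits = "99".isdigit()    # a boolean (True/False) comes back

print(type(shouted).__name__)     # str
print(type(all_digits).__name__)  # bool

# Because the result is a real boolean, it can go straight into an if statement.
if all_digits:
    print("Every character is a digit")
```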

Count Method


In [3]:
print("nananananananaBATMAN".count("A")) # Note count is case sensitive.
print("nananananananaBATMAN".count("BATMAN"))
print("nananananananaBATMAN".count("ROBIN"))


2
1
0

The count method shown above can be pretty useful. Counting the string it is called on, it takes two arguments and returns the total number of times the second string appears in the first. Do note that it is case-sensitive and it looks for an EXACT match. For example:

"abc".count("ab") --> 1
"acb".count("ab") --> 0

In short, count is looking for instances of the whole pattern and NOT how many times a and b appear individually.
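Here is a small sketch of the difference between counting a whole pattern and counting letters one at a time. The example string "abcab" is my own. It also shows one extra detail worth knowing: matches are not allowed to overlap.

```python
text = "abcab"

whole_pattern = text.count("ab")                       # looks for "ab" as a single unit
separate_letters = text.count("a") + text.count("b")   # counts each letter on its own

print(whole_pattern)     # 2
print(separate_letters)  # 4

# Matches cannot overlap, so "aaaa" contains "aa" twice, not three times.
print("aaaa".count("aa"))  # 2
```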

Replace Method


In [6]:
text = input("Feed me characters...FEED ME NOW GRRRR!!!  ")
replace_this = input("Now give me a single character to change in the text ")
replace_with = input("What should we change that character to ? ")

print("")
print("============ RESULT =================")
print(text.replace(replace_this, replace_with))


Feed me characters...FEED ME NOW GRRRR!!!  CATS CATS AND MORE CATS!
Now give me a single character to change in the text C
What should we change that character to ? B

============ RESULT =================
BATS BATS AND MORE BATS!

In the code cell above you may notice a new concept, "input". Input basically asks the user (yes, that's you) to type in a message. In this particular case we call input three times and store the results in three separate variables. Once that's done we take the text and replace character X with character Y.

For example:

text = "BATMAN"
replace_this = "A"
replace_with = "Z"
Returns: "BZTMZN"

Go ahead, why not play with this code for a bit.
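If you would rather experiment without typing answers into input each time, here is a small sketch that skips it. The variable names are my own. It also shows two things the cell above does not: replace accepts an optional third argument that limits how many swaps are made, and it hands back a new string rather than changing the original.

```python
original = "BATMAN"

changed = original.replace("A", "Z")        # every "A" is swapped
first_only = original.replace("A", "Z", 1)  # optional third argument: at most 1 swap

print(changed)     # BZTMZN
print(first_only)  # BZTMAN
print(original)    # BATMAN (the original string is left untouched)
```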

Getting side-tracked with Style

Now, these variable names are pretty good overall, but readability isn't just about having good names; it is also about creating names that ‘fit’ together, that is, a naming structure that is consistent throughout the code.

My first name for the text being replaced was "old_sequence". After a little bit of thought I came up with a much better name and swapped it to "replace_this". This new name is not any better in and of itself, but when we put it alongside "replace_with" it is obviously a more elegant name.

  • replace_this
  • replace_with

‘old_sequence’, although a perfectly reasonable variable name by itself, doesn’t show the reader that these two variables are related to one another. Changing ‘old_sequence’ to ‘replace_this’ makes the relationship clearer, and as an extra bonus it means our code could almost pass for normal English.

  • text.replace(replace_this, replace_with)
  • text.replace(old_sequence, replace_with)

In the grand scheme of things we are making tiny little changes here, but I’d argue the first example is better. This tiny little change makes my code more beautiful, more elegant, and above all, more readable.

Hopefully this week's homework (#2) will make these concepts clearer to you.

Format

If you read my code snippets you will see that I use format a lot. At heart, format is a way to create strings with 'moving parts' inside them.

For example, if I want to greet the user I probably want some code that returns "hello, {user's name}". We can do this with concatenation like so:


In [7]:
name = "chris"
greeting = "hi, " + name

print(greeting)


hi, chris

However, concatenation can become a little cumbersome when we start trying to create strings with several moving parts and/or with different data-types:


In [7]:
name = "chris"
age = 29
no_of_pets = 203
pet = "wombat"

s = "Hi, " + name + " your age is " + str(age) + " and you have " + str(no_of_pets) + " " + pet + "'s. Wow, thats a lot of " + pet + "'s"
print(s)


Hi, chris your age is 29 and you have 203 wombat's. Wow, thats a lot of wombat's

I think you will all agree that the string 's' is getting clunky right now. It's so long that we have to scroll sideways just to see the end of it.

Format to the rescue!

"I have {} cats and {} {}".format(x, y, z)

Normally when I write syntax I use {} for my own commentary, but on this occasion you need to be aware '{}' is literally what you type in.

Python will then replace each {} with a value, namely the value you give as an argument to format(). So format(x) will insert the value of x into the string. Maybe this is easier to understand with actual examples:


In [8]:
s = "I have {} cats and {} {}.".format(2, 3, "dogs")
print(s)


I have 2 cats and 3 dogs.

Let's quickly go back to my 203 pet wombat example. But this time, instead of using concatenation, we shall use the format method.


In [9]:
name = "chris"
age = 29
no_of_pets = 203
pet = "wombat"

s = "Hi, {0} your age is {1} and you have {2} {3}'s. Wow, thats a lot of {3}s".format(name, age, no_of_pets, pet)
print(s)


Hi, chris your age is 29 and you have 203 wombat's. Wow, thats a lot of wombats

Notice that this time instead of empty brackets '{}' we have numbers inside them (e.g. '{3}'). This relates to something called indexing (more on this later), but for now, let's just say that Python replaces {3} with the argument at position 3 passed to the format method, which in this case is the variable called 'pet'.

As a minor technical detail, Python counts from zero, so position 3 is actually the fourth argument: name is {0}, age is {1}, no_of_pets is {2} and pet is {3}.

Here's another example:


In [15]:
"{2} {0} {1} {0} {1} {2} {0} {1} {0} {1} {2}.".format("help","me", "please")

# {0} ==> "help"
# {1} ==> "me"
# {2} ==> "please"


Out[15]:
'please help me help me please help me help me please.'
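Numbers are not the only thing you can put inside the brackets. format also accepts names, which you then supply as keyword arguments. Below is a small sketch of the wombat example written that way. The placeholder names (name, count, pet) are ones I picked for illustration, and I have tidied the wording of the message slightly.

```python
template = "Hi, {name}. You have {count} {pet}s. Wow, that's a lot of {pet}s."
message = template.format(name="chris", count=203, pet="wombat")

print(message)
# Hi, chris. You have 203 wombats. Wow, that's a lot of wombats.
```

Named placeholders can be easier to read than {0}, {1}, {2} once a string has several moving parts, since the reader does not have to count arguments to see which value goes where.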

Homework Assignment #1

Your first homework assignment for this week is to take the variable named "text" (this has been defined for you) and count the number of times "z" occurs AND the letter "k" occurs. Add those numbers up and print the result.

As a further complication, we DO NOT care about case (e.g. ‘z’ and ‘Z’ should both be included in the count).

Don’t feel bad if you struggle; this homework is a step up in difficulty compared to normal. Oh, and I have also included a few test cases below that should help you figure out what to do (just in case my instructions were not clear enough).

For bonus difficulty, make your code work for any letter (e.g. "a", "b" returns the count of 'a' + 'A' + 'b' + 'B').


In [ ]:
text = "zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzZZZZeeeewwwwwwwwkKewewe2324____23!!!!!fsdffskdsdzzzzZZZZZZZZZZZZZZZZroiooioi"

# Simple Examples (string --> total you should return)
# "zZ"       --> 2
# "kK"       --> 2
# "KZ"       --> 2
# "1aZZZabc" --> 3
# "hello"    --> 0
# "ZzKk"     --> 4


# Your code goes here...

Homework Assignment #2

You are working on a program that has a greeting message and a goodbye message for several languages. Your job is to simply make the code more elegant and readable. Do whatever you think needs doing (note, there isn't really a right/wrong answer here, the aim of this homework is just to make you think about style and readability).


In [ ]:
# Make this more readable...

bye_spain = "buenas noches"
english_greeting = "hello"
english_goodbye = "sod off, lad"
greeting_japanese = "konichiwa"
spanish_greeting = "hola"
hello_in_french = "bonjour"
japanese_bye = "sayonara"